Recognition of Multiple Language Voice Navigation Queries in Traffic Situations
Authors
Abstract
This paper presents our work and results on a multilingual continuous speech recognition task. The aim was to design a system that makes a tolerable number of recognition errors on point-of-interest words in voice navigation queries, even in the presence of real-life traffic noise. An additional challenge was that no task-specific training databases were available for language and acoustic modeling. Instead, general-purpose acoustic databases were obtained and (probabilistic) context-free grammars were constructed for the acoustic and language models, respectively. A public pronunciation lexicon was used for English, whereas rule- and exception-dictionary-based pronunciation modeling was applied for French, German, Italian, Spanish and Hungarian. For the last four languages, the classical phoneme-based pronunciation modeling approach was also compared to a grapheme-based pronunciation modeling technique. Noise robustness was addressed by applying various feature extraction methods. The results show that high word recognition accuracy is achievable if cooperative speakers can be assumed.
Similar resources
Geo-location for voice search language modeling
We investigate the benefit of augmenting with geo-location information the language model used in speech recognition for voice-search. We observe reductions in perplexity of up to 15% relative on test sets obtained from both typed query data, as well as transcribed voice search data; on a subset of the test data consisting of “local” queries — search results displaying a restaurant, some addres...
Detection and Recognition of Multi-language Traffic Sign Context by Intelligent Driver Assistance Systems
This paper concerns the design of a new intelligent driver assistance system based on traffic sign detection with Persian context. The primary aim of this system is to increase the precision of drivers in choosing their path with regard to traffic signs. To achieve this goal, a new framework that implements fuzzy logic was used to detect traffic signs in videos captured along a highway f...
Efficient Correction Interfaces for Speech Recognition
The recognition of speech by computers is a challenging task and recognition errors are ultimately unavoidable. Error correction is thus a crucial part of any speech recognition interface. In this thesis, I look at how to improve the correction process in speech recognition. Before errors can be corrected, they must first be detected. I look at improving error detection by visualizing the recog...
Natural Language Interfaces over Spatial Data: Investigations in Scalability, Extensibility and Reliability
This thesis focuses primarily on constructing voice-only pedestrian guidance systems using spatial database techniques. In the process of doing this we first explored how to use authoring tools to build natural language interfaces over large databases. Specifically we built a natural language interface over the MusicBrainz database of 1.5GB and confronted the resulting scalability issues. We th...
A Real Time Traffic Sign Detection and Recognition Algorithm based on Super Fuzzy Set
Advanced Driver Assistance Systems (ADAS) benefit from current infrastructure to discern environmental information. Traffic signs are global guidelines which inform drivers of nearby characteristics of the path ahead. A Traffic Sign Recognition (TSR) system is an ADAS that recognizes traffic signs in images captured from the road and shows the information as an adviser or transmits it to other ADASs. In this...